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Full QCD Algorithms towards the Chiral Limit 

M. Hasenbusch a 

a NIC/DESY Zeuthen, Platanenallee 6, D-15738 Zeuthen, Germany 

I discuss the behaviour of algorithms for dynamical fermions as the sea-quark mass decreases. I focus on the 
Hybrid-Monte-Carlo (HMC) algorithm applied to two degenerate flavours of Wilson fermions. First, I briefly 
review the performance obtained in large scale HMC simulations. Then f discuss a modified pseudo-fermion 
action for the HMC simulation that has been introduced three years ago. f summarize recent results obtained 
with this pseudo-fermion action by the QCDSF and the ALPHA collaborations. I comment on alternatives to 
the HMC, like the Multiboson algorithm and variants of it. 



1. INTRODUCTION 

Today it is understood that further progress in 
the simulation of lattice QCD requires dynamical 
fermions. First large scale simulations with two 
flavours of degenerate Wilson fermions are done 
at a rather coarse lattice spacing a « 0.1 and 
are restricted to mps/my > 0.5. It is doubtful 
whether x-perturbation theory can close the gap 
to the physical value m w /m p f» 0.18 1 . It seems 
that, for a given numerical effort, Kogut-Susskind 
(KS) fermions allow to reach smaller sea-quark 
masses than Wilson fermions For a critical 
review see Yet, biased by my personal expe- 
rience, I shall restrict the following discussion to 
Wilson fermions. I expect that some statements 
also apply to KS fermions. To assess the progress 
that can be made with the machines of the near 
future, like QCDOC or apeNEXT, we have to un- 
derstand the scaling of the algorithms at hand. 
Also it is clear that we should not only rely on 
the speed-up of computers but also should work 
on the simulation algorithms. Recent progress 
pfTo] with the Hybrid-Monte-Carlo (HMC) algo- 
rithm |S] and new ideas [7] presented earlier this 
year show that it is worth while to pursue this 
direction. For simplicity, we shall focus in the fol- 
lowing on two degenerate flavours of sea-quarks. 
The corresponding partition function is given by 
Z = J D[U] exp[-S G [U]) detM[U} 2 , where 
Sg[U] is the gauge action and M[Z7] the fermion 
matrix. For the definition see textbooks. Since 



the explicit calculation of the determinant of the 
fermion matrix is not feasible for reasonably large 
lattices, so called pseudo-fermion fields are intro- 
duced |H]: 



detM[C/] 2 oc 

D[4>] [ D[0t]exp(-^[C/»t]) , (i) 



where Spf = |M[t/] _1 0| 2 is the pseudo-fermion 
action. Since the inverse of the fermion matrix 
appears, the pseudo-fermion action Spf is non- 
local and hence a local up-date of the gauge-field 
requires 0(Volume) operations, where Volume is 
the number of sites of the lattice. It follows that 
a full sweep over the lattice costs 0(Volume 2 ) 
operations. To avoid this unfavourable scaling of 
the costs with the Volume, molecular dynamics 
evolution of the gauge field and eventually the 
HMC algorithm were introduced. 

Later the multi-boson algorithm [H] was intro- 
duced. The inverse of the fermion matrix is ap- 
proximated by some polynomial. This polyno- 
mial is factorized into its roots and for each factor, 
a boson-field is introduced. This way, the effective 
action becomes local. It can be simulated with a 
standard local Monte Carlo algorithm. However, 
it turned out that autocorrelation times increase 
considerably with increasing order of the polyno- 
mial. 
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2. THE HMC ALGORITHM 

To allow the collective molecular dynamics evo- 
lution of the gauge-fields, new auxiliary variables, 
traceless Hermitian momenta P x ,n conjugate to 
the gauge-field are introduced. The resulting 
effective Hamiltonian consists of the gauge ac- 
tion, the pseudo-fermion action and a new part 
for P: H[U, 0, <ft,P\ = S G (U) + S PF [U, 0, tf] + 
5 J2x a ^Pxtn ■ ® ne elementary update ("trajec- 
tory" ) of the HMC algorithm is composed of the 
following steps: 

a) Global heat-bath of the pseudo-fermions and 
the conjugate momenta. 

b) Molecular dynamics evolution of the gauge- 
field U and the conjugate momenta P with fixed 
pseudo-fermions <p. 

c) Accept/Reject step: the gauge- field V 
that is generated by the molecular dynam- 
ics evolution is accepted with the proba- 
bility P acc = min[l,cxp(-H[[/',0,0t ; P'] + 
H(U,<f>,<ft,P])], where P' represents the conju- 
gate momenta generated in the molecular dynam- 
ics evolution. 

The last step is needed, since the molecu- 
lar dynamics evolution can not be done ex- 
actly and hence requires a numerical integration 
scheme, resulting in AH = H[U\ <j>, <jy , P'] — 
H(U,<j>,<ft,P])]?Q. 

2.1. Integration Schemes 

The HMC algorithm requires that the integra- 
tion scheme used is area preserving and reversible. 
The simplest scheme that fulfills these require- 
ments is the so called "leap-frog" algorithm. Let 
us define the update of the gauge-field and the 
momenta as 

Tu(St) : U^e lSrP U 

T p {5t) : P -> P -%St S(S g [U] + S PF [U]) . 

Then, the elementary step of the leap-frog algo- 
rithm is given by 

T2(Sr) = T P (j) TuiSr) T P . (2) 

A trajectory is composed of N m d consecutive el- 
ementary steps. Its integration error is 0(5t 2 ) 
for a trajectory of a given length, where St is the 



step-size. Note that the order of the updates of 
momenta and gauge-fields is not unique. In fact, 
in ref. \F3\ it was demonstrated that the alter- 
native order T^(St) = T v (4^) T p {St) T v (^) 
achieves the same acceptance rate as eq. (|5J (see 
fig. 2 of ref. JUj) with a roughly 15% larger step 
size St. In the literature [11)111112] also higher- 
order schemes are discussed. These schemes be- 
come increasingly complicated as the order in- 
creases. Recent studies, e.g. |13I14) . with the 
standard pseudo-fermion action show that 
higher-order schemes become less efficient than 
the simple leap-frog integration scheme as the sea- 
quark mass decreases. 

Sexton and Weingarten [TO] also suggested to 
split the update of the momenta into two parts: 

T pg {8t) : P^P-iSTSuSgp] 
T pf {8t) : P -> P -iSr S V S PF [U} . 

The leap-frog scheme is now generalized to 
T 2 (u,5t) = T PF (j) (3) 

M£M£M£)]"*'(t) 

This allows to put a part of the action that is 
simple to compute (e.g. Sq) on a smaller time- 
scale than the rest. 

2.2. Preconditioning 

It has been noticed for a while that precon- 
ditioning of the fermion matrix is indispensable 
in lattice QCD simulations. First, the number 
of iterations needed to solve M^ 1 ^ is reduced, 
but also the step-size of the HMC algorithm can 
be increased (See e.g. USD- In addition to the 
even-odd preconditioning, SSOR preconditioning 
and variants of it ^jl have been used in prac- 
tice. The authors of ref. J7| have detailed 
how even-odd preconditioning can be applied to 
clover- improved Wilson fermions. They propose 
two possible variants. The asymmetric version: 

detM oc det(l ee + T ee ) detM , (4) 

where M = 1 00 + T 00 - H oe (l ee + T ee y 1 H eo . H oe 
and H eo is the hopping-matrix, connecting odd 
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with even sites and vice versa. T OQ and T ee encode 
the clover term. The symmetric version is 

detM oc det(l ee + T ee )det(l 00 + T 00 )detM sym , (5) 

where now M sym = l ao - (l OQ + r oo ) _1 M oe (l ee + 
T ee )~ l M eo . In most simulations, the asymmet- 
ric version J3J has been used. However, recently 
ref. U,9j reported that the symmetric version J5| 
allows for a considerably larger step-size of the 
leap-frog than 

3. SCALING OF THE HMC 

At the Lattice 2001 T. Lippert [THj reported 
about the scaling of the HMC algorithm based 
on the simulations of the SESAM collaboration. 
The result can be summarized as 

/r\5 / -, \ 2.8(2) 

N '»» = 2 - 3(7) " 107 '(a ) i^Ts) 

/r\5 / -I \ 4.3(2) 

N flops = 1.6(4) -10 7 - - • 

\aj \ampsj 

for P = 5.6 and 5.5, respectively. The increase 
of the costs with decreasing mass mps of the 
pseudo-scalar meson has three sources: 

a) Since the condition number of the fermion 
matrix increases as mps decreases, the solver 
needs more iterations to compute M[£/] _1 0. 

b) Decrease of the step-size St of the leap-frog 
integration scheme. 

c) Increase of the autocorrelation time. 

Here I like to briefly discuss the behaviour in 
two recent large scale simulations using clover- 
improved Wilson fermions. The CP-PACS col- 
laboration ^2] has simulated the Iwasaki gauge 
action at three /3-values corresponding to the lat- 
tice spacings a « 0.22,0.16 and 0.11 fm. The 
JLQCD collaboration JHj collaboration simu- 
lated the Wilson gauge action with (3 = 5.2 cor- 
responding to ow 0.09 fm. 

In table VI of ref. ^1] integrated auto- 
correlation times for the plaquette, the number 
of iterations N nv of the BiCGStab solver and 
the effective pion mass are given. Among these, 
T~mnv is the largest. For j3 = 2.1 (a = 0.11 



fm) it behaves as Tmnv oc (k c — n)~°- 3 ( 2 \ At 
the two smaller values of f3, we see a slightly 
larger increase of the auto-correlation times, cor- 
responding to the exponents —0.5(1) and —0.4(1). 
JLQCD ^U] report autocorrelation times in table 
IV. In their case, the autocorrelation time of the 
plaquette even decreases as k increases. Next, 
let us discuss the number of iterations required 
by the BiCGStab solver. The numbers for the 
20 3 x 48 lattice given in table I of ref. J5| can be 
fitted as 

N mv = 0.253(4) ( Kc - «)-o.893(2) _ (6) 

The numbers for Ni nv given in table II of ref. [T4*j 
lead to virtually the same exponent. Fitting these 
numbers as a function function of mps, taken 
from table XXIII of ref. gives Ninv °c m ps' 69 ■ 
Finally, let us consider the step-size of the inte- 
gration scheme. Taking the numbers for = 2.1 
from table II of ref. ^3] f° r the step-size, we find 
St oc (n c — k) 0,6 . It is more difficult to extract 
such a result from the numbers of ref. since 
the acceptance rate for the different values of k 
vary quite a bit. Nevertheless it is clear from the 
numbers that the step-size has to be reduced con- 
siderably as k increases. Taking into account the 
increase of N inv and the decrease of St we get 
cost oc rTip Z s with z ss 3, which is similar to the 
results of ref. |18| . 

At the Lattice 2002, A. Irving |231 and Y. 
Namekawa [21] reported about explorative stud- 
ies at light quark masses with ^£^- rj 0.4. De- 
tails of ref. \M will be given below. In ref. 1211 
the Iwasaki gauge action at ft = 1.8 (a w 0.22 
fm) was studied at K-values corresponding to 
mps/mv = 0.6, 0.5 and 0.4. They find that 
the BiCGStab algorithm frequently fails to con- 
verge at the smaller quark masses. This problem 
was overcome by replacing the BiCGStab with 
the BiCGStab(DS-L) algorithm [22, which is a 
generalization of the BiCGStab. From fig. 2 of 
ref. [21] we see that a step size of St = 0.003 
had to be used at nips/my ~ 0.5 for a 12 3 x 24 
lattice. Despite of this small step-size, spikes (i.e. 
extremely large AH) frequently occur (See fig. 
2 of ref. 21 ). They find that a reduction of the 
step-size St rapidly reduces the frequency of these 
spikes. 
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4. MODIFIED PSEUDO-FERMIONS 

It has been observed by various authors that 
replacing the original fermion matrix by the even- 
odd preconditioned one allows for a larger step- 
size St in the HMC at constant acceptance rate. 
This lead to the idea @] to factorized the fermion 
matrix such that each part has a smaller condition 
number than the original fermion matrix. In the 
modified pseudo-fermion action, for each of the 
factors a pseudo-fermion field is introduced. The 
splitting of the fermion matrix M can be written 
as M := W' 1 M . In ref. @J W has been 
chosen as fermion matrix with a smaller hopping- 
parameter than M itself. Equivalently we get 

W = M + p . (7) 

A second choice that we studied is inspired 
by twisted mass QCD 

W = M + ifrf 5 . (8) 

We introduce pseudo-fermions for both W and 
M: 

detMM* oc J V<p\ J Vfa J V<j>\ J V§ 2 

exp(-|W^Vi| 2 - |Af-V 2 | 2 ) (9) 

Hence, the modified pseudo-fermion action is 
then given by 

Sf = Sfi + S F2 = |W - Vi| a + \M^<P2\ 2 • (10) 

For the practicability of the HMC algorithm it is 
important that the variation of Sf can be easily 
computed. For our choices of W, the variation 
of Sfi can be computed in the same way as the 
variation of Sf of eq. Q). Also the variation of 
Sf2 can be explicitely computed: SSf2 = 

XHMY - Y ] SM ] X + XUW4> 2 + <j>\SW^X (11) 

where X = M^M^Wfa, Y = M~ X W^ 2 - 
The experience with the 2D Schwinger model 0] 
has shown that the step-size of the integration 
scheme can indeed be increased by replacing the 
standard by the modified pseudo-fermion action. 
For the largest value of k that we have studied, 
the step-size could be increased by a factor of two, 
while keeping the acceptance rate fixed. For a re- 
lated approach see ref. 0. 



4.1. Numerical results for Lattice QCD 

Recently, detailed studies 2.'! 2 1 SI of the 
modified pseudo-fermion action applied to the 
HMC simulation of Lattice QCD with two 
flavours of dynamical Wilson fermions have been 
carried out. In these studies, the method has been 
applied on top of even-odd preconditioning. I.e. 
the splitting is applied to M. All three studies 
have used the asymmetric version Q and (unfor- 
tunately) not eq. (J5J as recommended by ref. . 
The first question is, whether also here the step- 
size can be increased and how this scales with 
the sea-quark mass and the lattice size. In addi- 
tion, it is important to check, whether there are 
effects of the modified pseudo-fermion action on 
the number of iterations needed by the solver, the 
autocorrelation times and the reversibility of the 
integration scheme. 

In ref. we simulated at (3 = 5.2 with c sw = 
1.76. Note that the final result of ref. is 
c sw — 2.0171. First we studied a 8 3 x 24 lattice 
at k — 0.137. Later we simulated a 16 3 x 24 lattice 
at k = 0.139, 0.1395 and 0.1398. Following ref. 
[2"?| . these values of k correspond to mps/my rj 
0.856,0.792,0.715 and 0.686, respectively. 

We have tested two different integration 
schemes: The standard leap-frog and a partially 
improved one suggested by Sexton and Wein- 
garten (see eq. (6.4) of ref. ^0]) that has a re- 
duced amplitude of the 0{8t 2 ) error compared 
with the leap-frog scheme. We used a trajectory 
length 1 throughout. To find the optimal value of 
the parameter p of the modified pseudo-fermion 
action, we have performed runs for different val- 
ues of p with 100 trajectories each. For all these 
runs we used the same step-size St. We looked 
for the value of p that gives the maximal accep- 
tance rate. This search is not too difficult, since 
there seems to be only one local maximum, which 
is rather broad. 

We have tested both splittings eq. J7J) and 
eq. (JSJ. We found a small advantage for eq. J7J. 
A very important result is that the partially 
improved scheme gains more from the modified 
pseudo-fermion action than the leap-frog scheme. 
In particular for our smallest quark mass, given 
by mps/mv ~ 0.686, the leap-frog performs 
better than the partially improved scheme in 
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the case of the standard pseudo-fermion action, 
while this order is just reversed when the mod- 
ified pseudo-fermion action is used. In partic- 
ular, we find for the 16 3 x 24 lattice with the 
standard pseudo-fermion action and the leap-frog 
scheme that St = 0.01 is needed to get P acc = 
0.77(3). With the modified pseudo-fermion ac- 
tion at the optimal p already St = 0.02 gives 
Pace = 0.76(2). With the partially improved inte- 
gration scheme and the standard pseudo-fermion 
action we get with St = 0.0166... an accep- 
tance rate P acc = 0.82(2). The modified pseudo- 
fermion action with St = 0.066... gives P acc = 
0.74(3). Comparing the two integration schemes, 
we have to note that for one elementary step, 
for the partially improved scheme, the variation 
of the force has to be computed twice as often 
as for the leap-frog. It remains an advantage of 
about (0.066.. .)/(2 * 0.01) = 3.33... for the par- 
tially improved scheme combined with the mod- 
ified pseudo-fermion action, compared with the 
leap-frog and the standard pseudo-fermion ac- 
tion. For a fair comparison, we also have to 
take into account the overhead caused by Sfi of 
the modified pseudo-fermion action. Taking the 
total number of iterations needed by the solver 
within one trajectory we arrive at a speed-up of 
17000/6900 =2.5. Based on this experience it 
would be very interesting to study the behaviour 
of an 0(St 4 ) improved integration scheme. Fur- 
thermore we find that reversibility violations are 
of similar magnitude for the modified pseudo- 
fermion action and the standard pseudo-fermion 
action. For the 8 3 x 24 lattice we performed ex- 
tended runs with up to 8300 trajectories. From 
these runs we see that for the standard and the 
modified pseudo-fermion action as well as for dif- 
ferent integration schemes, the autocorrelation 
times are the same, as long as the acceptance rate 
is the same. 

The QCDSF collaboration has simulated a 
16 3 x 32 lattice at j3 = 5.29 with c sw = 1.9192 and 
K = 0.1355, which corresponds to mps/my ~ 
0.7. Preliminary results for a 24 3 x 48 lattice at 
j3 = 5.25 with c sw = 1.9603 and K — 0.13575 are 
also available. This corresponds to mps/mv ~ 
0.6. They have used the pseudo-fermion action 
(J7J and the leap-frog integration scheme with two 



time-scales (j2Jl. They compared three versions 
of the algorithm: (A) The HMC with the stan- 
dard pseudo-fermion action, (B) the HMC with 
the modified pseudo-fermion action, where both 
Sfi and Sf2 are on the same time-scale and (C) 
analogous to Sexton and Peardon Sfi is on 
the same (smaller) time-scale as the gauge action. 
They have used a chronologicial prediction of the 
start vector for the solver. 

For mps/mv ~ 0.7 they find that (B) is about 
2.8 times faster than (A). Similar to our experi- 
ence |2BJ) the step-size St of version (B) is twice 
the step size of (A). It is very surprising that an 
other factor of 1.4 speed-up comes from the iter- 
ation number of the solver. Naively one would 
expect that the larger step-size of version (B) 
would degrade the chronologicial prediction of the 
start vector and hence enlarge the iteration num- 
ber. Maybe, with the modified pseudo-fermion 
action the evolution of the solution vector be- 
comes smoother and hence the chronologicial pre- 
diction more effective. With version (C), they 
even find a speed-up of about 3.4 compared with 
(A). The preliminary results for mps/my ~ 0.6 
are quite similar: with version (B) the speed-up 
is about 3.5 and with (C) 3.4. But here the pa- 
rameters of the algorithm are not yet tuned much. 
Again the speed-up is partially due to a reduction 
of the iteration number of the solver in the case 
of (B) and (C) compared with (A). 

The ALPHA collaboration studied the 
pseudo-fermion action based on the splitting (JSJ. 
They have used the partially improved integra- 
tion scheme of Sexton and Weingarten. They 
imposed Schrodingcr functional boundary condi- 
tions. They have studied lattices of size L/a = 6 
up to L/a = 24 in a range 1.0 < u < 5.7 of their 
renormalized coupling, corresponding to physical 
lattice sizes of 10 -2 fm up to 1 fm. The range of 
/3-values is 5.2 < /3 < 11.1. 

They have chosen the parameter p of the split- 
ting following the rule p ~ (< \ m in > / < 
X max >) 1/4 , where X min and X max are the min- 
imal and maximal eigenvalues of M^M. The 
idea of this choice is that both M and W should 
have the same condition number. The runs where 
performed on a APEmille computer using single 
precision floating point numbers. Therefore spe- 
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cial attention on reversibility violations is needed. 
Their careful checks show that there is no relevant 
difference in this respect between the two pseudo- 
fermion actions. To check explicitly the effect of 
reversibility violations on the observables, they 
have varied the length of the trajectory. In addi- 
tion to the usual r = 1 they studied r = 0.5, 1.5 
and 3. They conclude that for r = 1, the re- 
versibility violations do not invalidate the results 
of their simulations. 

Since they performed rather long runs, even for 
L/a = 24 they have generated 4000 trajectories, 
they can give meaningful results for the autocor- 
relation times. Measured in units of trajectories, 
for the same acceptance rate, they find virtually 
no difference between the modified and standard 
pseudo-fermion action. 

The main difference comes from the step-size 
that can be used to obtain an acceptance rate 
of about 80%. For the modified pseudo-fermion 
action, the step-size can be more than doubled 
compared with the standard case. There is only 
a rather small increase of this factor towards 
smaller values of (3 and larger L/a. 

As a further test of the modified pseudo- 
fermion action, 1 have performed simulations at 
mps /my ~ 0.4 [2Hj. To this end I have taken the 
parameters reported by Irving at the Lattice 2002 
[27)] . He generated 2300 trajectories for a 16 3 x 32 
lattice at (3 = 5.2, c sw = 2.0171, k = 0.1358. He 
obtained mps/mv — 0.43(2) and m„L = 3.3(1), 
where the last result indicates that the lattice size 
is actually too small to extract L — > oo results 
from the simulation. The result for the plaque- 
tte is Plaq — 0.53770(4) with an autocorrelation 
time of t w 7. The step-size of the leap-frog 
scheme is St — 1/400. The length of the tra- 
jectory is t = 1. 

In my run I have tested only the modified 
pseudo-fermion action based on eq. 0. The par- 
tially improved integration scheme of Sexton and 
Weingarten is used. First I equilibrated the sys- 
tem for the parameters chosen. After some rough 
search for the optimal value of p, I generated 
50 trajectories with 2 pseudo-fermion fields and 
the choice p — 0.05. With a step-size of 1/33, 
Pace = 0.81(4) is obtained. Taking into account 
the fact that in the partially improved scheme the 



force has to be computed twice per step, we see a 
speed-up of 400/(2 x 33) ss 6 compared with the 
run of Irving |20j . 

Then I simulated with 3 pseudo-fermion fields, 
i.e. splitting the fermion matrix into three fac- 
tors. The parameters p\ = 0.4 and p2 — 0.03 
of the corresponding pseudo-fermion action are 
guesses based on the experience with the two 
pseudo-fermion case. The fermion matrix is nor- 
malized such that it is 1 for k = 0. After gen- 
erating about 50 trajectories, we found that a 
larger step size than with two pseudo-fcrmions 
could be reached. Based on that we generated 
1500 trajectories with a step-size of St — 1/25. 
We find an acceptance rate of P acc — 0.809(6). 
The expectation value of the average plaquette is 
Plaq — 0.53773(5), which is compatible with Irv- 
ing's result [201 • The integrated auto-correlation 
time of the plaquette is r = 7.5±3.0. This again 
confirms that auto-correlation times depend very 
little on the pseudo-fermion action that is used. 

In the simulation I have also seen a few spikes. 
The largest is AH « 3774. The second largest 
is AH w 30. These spikes are rather small com- 
pared with those of ref. [2JJ which are of order 
10 5 . Also (exp(-Atf)) = 0.984(10) for my simu- 
lation seems to be fine. This result shows that the 
modified pseudo-fermion action not only allows 
for a larger step-size but also tames the spikes. 

5. THE MULTIBOSON ALGORITHM 

The qq+q collaboration has recently reported 
results for two-flavour Wilson simulations at 
quark masses in the range ^m s < m q < 2m s 
using the TSMB version of the algorithm. A 
discussion of the performance, depending on the 
quark mass, the lattice size and the lattice spac- 
ing, is given m (and refs. therein). The de- 
pendence on the pseudo scalar mass is given by 
cost cx rrip Z s with 3 < z < 4, which seems a little 
worse than the HMC algorithm. This is actu- 
ally not too surprising. The increase of CPU cost 
with decreasing sea-quark mass in the MB algo- 
rithm and variants of it have mainly two sources: 
First the increase of the order of the polynomial 
and second the increase of autocorrelation times 
that are related to the order of the polynomial. 
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The increase of the order of the polynomial should 
be similar to the increase of the number of itera- 
tion of the solver in the HMC algorithm, which is 
the major source for the increasing effort in the 
HMC algorithm. On the other hand, for the MB 
algorithm, the numerical costs increase trivially 
with the lattice volume: cost oc (L/a) 4 , which 
compares favourably with the HMC algorithm, 
where we have cost oc (L/a) 5 for the leap-frog in- 
tegration scheme. The qualitative disadvantages 
of the MB algorithm compared with the HMC 
algorithm are the large memory requirement to 
store the boson-fields and the difficulty to imple- 
ment the algorithm for improved fermion actions. 

6. CONCLUSIONS 

Tuned once more, the HMC algorithm seems 
to be a good choice for the simulation of dynam- 
ical Wilson fermions. One should take great care 
of all details like the precise choice of the inte- 
gration scheme and the preconditioning of the 
fermion matrix. The modified pseudo-fermion ac- 
tion £1] speeds up the HMC simulation by al- 
lowing for a larger step-size of the integration 
scheme. Higher order integration schemes seem 
to profit more from the modified pseudo-fermion 
action than the leap-frog scheme. Hence, it seems 
likely that the scaling of the cost with the volume 
of the lattice can be improved to cost oc (L/a) 4 ' 5 
by using an 0{5t a ) integration scheme, without 
negative side effects on the scaling with mps- 
The modified pseudo-fermion action introduces 
no unwanted side-effects like larger violations of 
reversibility or larger autocorrelation times. The 
proposal of Peardon [H] is more flexible than that 
of ref. 0]. Hence, it might lead to even larger 
speed-ups. Also it is straight forward to apply 5 
to the Polynomial HMC algorithm. Whether new 
ideas [7] will beat the HMC remains to be seen. 
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